Parameterized reinforcement learning for optical system optimization
Authors
Abstract
Engineering a physical system to feature designated characteristics states an inverse design problem, which is often determined by several discrete and continuous parameters. If such a system must exhibit a particular behavior, the mentioned combination of both discrete and continuous parameters results in a challenging optimization problem that requires an extensive search for the optimal design. However, if the corresponding task can be reformulated as a parameterized Markov decision process, reinforcement learning (RL) provides a heuristic framework to solve it. In this work, we use multi-layer thin films as an example of the aforementioned problems and consider three design parameters: each film layer’s dielectric material (discrete) and thickness (continuous), as well as the total number of layers (discrete). While recent methods merely determine the layers’ thicknesses and, less commonly, the layers’ materials, our approach optimizes the number of stacked layers as well. In summary, we further develop a Q-learning variant and thereby outperform human experts and current approaches like needle-point optimization or naive RL. For this purpose, we propose an exponentially transformed reward signal that eases policy learning and enables constrained optimization. Moreover, the learned Q-values contain information about the optical properties of the thin films, which allows us an interpretation and what-if analysis and thus provides explainability.
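The exponentially transformed reward mentioned in the abstract can be illustrated with a minimal sketch. The exact transform and its scale are not given here, so the function below is an assumption: it maps a non-negative spectral design error to a bounded reward via exp(-error/tau), which strongly rewards designs close to the target while compressing differences among poor designs.

```python
import numpy as np

def exp_reward(error, tau=0.1):
    """Hypothetical exponential reward transform (not the paper's exact form).

    Maps a non-negative design error (e.g. the deviation of a thin-film
    stack's reflectivity from a target spectrum) to a reward in (0, 1].
    Zero error yields the maximum reward of 1; large errors decay
    smoothly toward 0, easing policy learning on near-optimal designs.
    """
    return float(np.exp(-float(error) / tau))
```

For example, `exp_reward(0.0)` returns `1.0`, and halving the error more than doubles the reward near the scale `tau`, so the agent is pushed hardest where it matters most.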
Similar resources
Reinforcement Learning with Polynomial Learning Rate in Parameterized Models
We consider reinforcement learning in a parameterized setup, where the model is known to belong to a finite set of Markov Decision Processes (MDPs) under the discounted return criterion. We propose an on-line algorithm for learning in such parameterized models, the Parameter Elimination (PEL) algorithm, and analyze its performance in terms of the total mistakes. The algorithm relies on Wald’s s...
Reinforcement Learning for Parameterized Motor Primitives [IJCNN1759]
One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the “building blocks of movement generation”, called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been mad...
Reinforcement Learning with Parameterized Actions
We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions—discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy ...
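The parameterized-action setting described above (a discrete action paired with a continuous parameter) can be sketched in a few lines. The names `q_discrete` and `param_policies` are illustrative, not from Q-PAMDP: the agent first picks a discrete action (e.g. a layer material) epsilon-greedily from tabular Q-values, then queries that action's parameter policy for the continuous argument (e.g. the layer thickness).

```python
import random

def select_parameterized_action(q_discrete, param_policies, state, eps=0.1):
    """Hypothetical selection step for a parameterized action space.

    q_discrete: dict mapping state -> list of Q-values, one per discrete action.
    param_policies: list of callables, one per discrete action, each mapping
    a state to that action's continuous parameter.
    Returns the chosen (discrete action index, continuous parameter) pair.
    """
    q_values = q_discrete[state]
    if random.random() < eps:
        # Explore: pick a random discrete action.
        action = random.randrange(len(q_values))
    else:
        # Exploit: pick the discrete action with the highest Q-value.
        action = max(range(len(q_values)), key=lambda i: q_values[i])
    # The continuous parameter comes from the chosen action's own policy.
    return action, param_policies[action](state)
```

With `eps=0.0` the selection is fully greedy, which is how such a policy would typically be evaluated after training.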
Active exploration in parameterized reinforcement learning
Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as Robotics. However, when confronted to non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose a...
Deep Reinforcement Learning in Parameterized Action Space
Recent work has shown that deep neural networks are capable of approximating both value functions and policies in reinforcement learning domains featuring continuous state and action spaces. However, to the best of our knowledge no previous work has succeeded at using deep neural networks in structured (parameterized) continuous action spaces. To fill this gap, this paper focuses on learning wi...
Journal
Journal title: Journal of Physics D
Year: 2021
ISSN: 1361-6463, 0022-3727
DOI: https://doi.org/10.1088/1361-6463/abfddb